[UI] Sequence Extraction Quickstart (GPT-3.5-Turbo)
Last updated: October 6th, 2024
This Quickstart provides an end-to-end walkthrough of how to run Sequence Extraction tests with DynamoEval. It also covers general guidelines and specific examples for setting up test configurations.
If you are a developer and wish to follow the same steps with Dynamo AI’s SDK, please refer to the associated SDK Quickstart.
Create Model
Begin by navigating to the Dynamo home page. This page contains the model registry – a collection of all the models you have uploaded for DynamoEval or DynamoGuard. The model registry contains information to help you identify your model, such as the model source, use case, and date updated.
For the Sequence Extraction test on 10K Reports data, follow the instructions below to add a fine-tuned GPT-3.5 model to your model registry:
- To upload a new model to the registry, click the Upload new model button.
- A popup will appear, requesting information such as Model name and Model source. *Remote inference can be used to connect to any model that is provided by a third party or is already hosted and accessible through an API endpoint. Local inference can be used for a custom model file or a HuggingFace Hub ID.*
Example. For this quickstart, we recommend setting the following:
- Model name: GPT 3.5 10K Reports
- Model Source: Remote Inference
The next page of the popup will ask for more detailed information about the model provider, API key, model identifier, as well as an optional model endpoint (if required by your API provider).
Example. For this quickstart, we recommend setting the following:
- API Provider: OpenAI
- API Key: OpenAI API key provided to you
- Model: gpt-3.5-turbo
- Endpoint: leave blank
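Optionally, before registering the model, you can verify that the API key and model identifier are valid. Below is a minimal sanity check using the official `openai` Python client (v1.x); the placeholder key is an assumption, and this step is not part of DynamoEval itself.

```python
# Optional sanity check (not part of DynamoEval): confirm the API key and
# model identifier work before registering the model in the UI.
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")  # placeholder: the key provided to you

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # same identifier entered in the Model field
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(response.choices[0].message.content)
```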
At this point, your model should be created and displayed in the model registry.
Click on the DynamoEval link on the right, then navigate to Testing > New Test to start creating a test for this model.
Create Test
- Fill in the test title to be indicative of the test you are running.
- Select Privacy Tests
- Select Test Type: Sequence Extraction.
1. Select an Existing Dataset
After selecting the test type, you’ll be asked to select a dataset. Here, you can select an existing dataset that has been previously uploaded by clicking the checkbox next to its name. If you are using the platform for the first time, skip to the next section.
2. OR Upload a New Dataset
Alternatively, you can upload a new dataset by clicking “Upload custom dataset”. On the pop-up sidebar, you’ll be asked to provide a dataset name and description. We recommend being specific so you can clearly identify the dataset in the future.
Next, identify an access level. Finally, you’ll be asked to upload a dataset. Currently, Dynamo AI supports running attacks and evaluations on CSV datasets (Local Dataset) or datasets hosted on the HuggingFace Hub (HuggingFace Dataset).
- For HuggingFace Dataset (left), you will be asked to fill in the Dataset ID and the access token, which is required if the dataset is private. A quick way to confirm these values resolve is shown below.
- For Local Dataset (right), you will be asked to drag and drop the CSV file.
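If you plan to use a private HuggingFace dataset, you can confirm that the Dataset ID and access token resolve before entering them in the UI. A minimal sketch using the `datasets` library (recent versions accept a `token` argument; the dataset ID and token below are placeholders):

```python
# Minimal check that a (possibly private) HuggingFace dataset is reachable.
# Requires `pip install datasets`; the ID and token are placeholders.
from datasets import load_dataset

ds = load_dataset(
    "your-org/your-dataset-id",    # Dataset ID you will enter in the UI
    token="hf_your_access_token",  # required only for private datasets
)
print(ds)
```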
Example. For this quickstart, we recommend setting the following:
- Dataset name: 10K Reports Finetuning Dataset
- Description / Access: leave as default
- Dataset Type: Local Dataset
- Uploaded file: 10k_dataset.csv
After uploading, you will be asked to select (via checkbox) the dataset you just created. Hitting “Next” will bring you to the Dataset configuration page. Make sure the Dataset Column Name is set to prompt.
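Before uploading, it can also help to confirm that your CSV contains the columns this quickstart references (prompt for the dataset configuration, plus text and title for the test parameters below). A minimal sketch using pandas; whether all three columns are required depends on your dataset, so adjust as needed:

```python
# Quick pre-upload check that the CSV has the columns this quickstart uses.
import pandas as pd

df = pd.read_csv("10k_dataset.csv")
expected = {"prompt", "text", "title"}  # columns referenced in this quickstart
missing = expected - set(df.columns)
if missing:
    raise ValueError(f"CSV is missing expected columns: {missing}")
print(f"{len(df)} rows, columns: {list(df.columns)}")
```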
Test Parameters Setup
This page allows you to vary different test parameters to observe performance across different settings. For DynamoEval Sequence Extraction tests, you can vary the following parameters (a conceptual sketch of how they interact follows the recommended values below):
- Temperature: controls the randomness of the model’s generations during decoding; lower values produce more deterministic outputs
- Sequence length: controls the number of tokens generated by the model when performing the sequence extraction attack
- Sampling rate: controls the number of queries made to the model to estimate vulnerability to sequence leakage
- Memorization Granularity: either "paragraph" or "sentence"; controls the level of granularity at which memorization is measured for each sample
- Is Fine-tuned: select true if you are using a fine-tuned model and would like to see whether it memorized the contents of the fine-tuning dataset
For this quickstart, we recommend setting the values as specified below:
Dataset:
- Text Column: text
- Title Column: title
Test Hyperparameters:
- Temperature: 0
- Is Fine-tuned: false
- Sampling rate: 500
- Sequence length: 500
- Prompt length: 128
- Memorization Granularity: paragraph
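To make these parameters concrete, here is a conceptual sketch of what a sequence extraction attack measures: prompt the model with the first prompt-length tokens of each sampled document and check how much of the true continuation it reproduces. This is an illustration of the idea only, not DynamoEval’s implementation; the tokenizer choice, prompt construction, and naive prefix-match scoring are all simplifying assumptions.

```python
# Conceptual illustration of a sequence extraction attack (NOT DynamoEval's
# actual implementation). Requires `pip install openai tiktoken pandas`.
import pandas as pd
import tiktoken
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")  # placeholder key
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

PROMPT_LENGTH = 128     # tokens shown to the model (Prompt length)
SEQUENCE_LENGTH = 500   # tokens generated per query (Sequence length)
SAMPLING_RATE = 500     # number of documents queried (Sampling rate)
TEMPERATURE = 0         # deterministic decoding (Temperature)

df = pd.read_csv("10k_dataset.csv").head(SAMPLING_RATE)

for text in df["text"]:
    tokens = enc.encode(text)
    if len(tokens) <= PROMPT_LENGTH:
        continue  # document too short to split into prompt + continuation
    prompt = enc.decode(tokens[:PROMPT_LENGTH])
    true_continuation = tokens[PROMPT_LENGTH:PROMPT_LENGTH + SEQUENCE_LENGTH]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=SEQUENCE_LENGTH,
        temperature=TEMPERATURE,
    )
    generated = enc.encode(response.choices[0].message.content)

    # Naive overlap score: count how many tokens of the true continuation
    # are reproduced verbatim from the start (real tests use more nuanced
    # matching at paragraph or sentence granularity).
    match = 0
    for a, b in zip(generated, true_continuation):
        if a != b:
            break
        match += 1
    print(f"reproduced {match}/{len(true_continuation)} tokens")
```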
Verifying Test Summary
Finally, verify that your test summary matches the settings above before queueing the test.
Checking Results
After queueing the test, you will see one of three status indicators on the model’s Testing tab: Complete, In Progress, or Awaiting Resources.
Once the test is marked Complete, you can look through the test results in three different ways:
- Dashboard: In the Dashboard tab, examine key metrics such as # Sequence Extracted, Precision, and Recall.
- Deep-dive: Under the Testing tab, click “View Test Details” for the Sequence Extraction Attack section to examine the results for each inference from the model.
- See report: Under the Testing tab, click the drop-down arrow on the right of “Sequence Extraction Attack”, then click “Download report” to view the generated Sequence Extraction report.
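For intuition on the Precision and Recall metrics, one plausible token-overlap formulation is sketched below; these definitions are illustrative assumptions and may not match DynamoEval’s exact computation.

```python
# One plausible token-level scoring of an extracted sequence (illustrative
# only; DynamoEval's exact metric definitions may differ).
from collections import Counter

def precision_recall(generated_tokens, reference_tokens):
    """Precision: share of generated tokens that appear in the reference.
    Recall: share of reference tokens recovered by the generation."""
    overlap = sum((Counter(generated_tokens) & Counter(reference_tokens)).values())
    precision = overlap / len(generated_tokens) if generated_tokens else 0.0
    recall = overlap / len(reference_tokens) if reference_tokens else 0.0
    return precision, recall

p, r = precision_recall("the cat sat on".split(), "the cat sat on the mat".split())
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=1.00, recall=0.67
```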